1 ATGGGGGAGA TGCAGGGCGC GCTGGCCAGA GCCCGGCTCG AGTCCCTGCT
51 GCGGCCCCGC CACAAAAAGA GGGCCGAGGC GCAGAAAAGG AGCGAGTCCT 
101 TCCTGCTGAG CGGACTGGCT TTCATGAAGC AGAGGAGGAT GGGTCTGAAC 
151 GACTTTATTC AGAAGATTGC CAATAACTCC TATGCATGCA AACACCCTGA 
201 AGTTCAGTCC ATCTTGAAGA TCTCCCAACC TCAGGAGCCT GAGCTTATGA 
251 ATGCCAACCC TTCTCCTCCA CCAAGTCCTT CTCAGCAAAT CAACCTTGGC 
301 CCGTCGTCCA ATCCTCATGC TAAACCATCT GACTTTCACT TCTTGAAAGT
351 GATCGGAAAG GGCAGTTTTG GAAAGGTTCT TCTAGCAAGA CACAAGGCAG 
401 AAGAAGTGTT CTATGCAGTC AAAGTTTTAC AGAAGAAAGC AATCCTGAAA 
451 AAGAAAGAGG AGAAGCATAT TATGTCGGAG CGGAATGTTC TGTTGAAGAA 
501 TGTGAAGCAC CCTTTCCTGG TGGGCCTTCA CTTCTCTTTC CAGACTGCTG 
551 ACAAATTGTA CTTTGTCCTA GACTACATTA ATGGTGGAGA GTTGTTCTAC 
601 CATCTCCAGA GGGAACGCTG CTTCCTGGAA CCACGGGCTC GTTTCTATGC 
651 TGCTGAAATA GCCAGTGCCT TGGGCTACCT GCATTCACTG AACATCGTTT 
701 ATAGAGACTT AAAACCAGAG AATATTTTGC TAGATTCACA GGGACACATT 
751 GTCCTTACTG ACTTCGGACT CTGCAAGGAG AACATTGAAC ACAACAGCAC 
801 AACATCCACC TTCTGTGGCA CGCCGGAGTA TCTCGCACCT GAGGTGCTTC
851 ATAAGCAGCC TTATGACAGG ACTGTGGACT GGTGGTGCCT GGGAGCTGTC
901 TTGTATGAGA TGCTGTATGG CCTGCCGCCT TTTTATAGCC GAAACACAGC
951 TGAAATGTAC GACAACATTC TCAACAAGCC TCTCCAGCTG AAACCAAATA
1001 TTACAAATTC CGCAAGACAC CTCCTGGAGG GCCTCCTGCA GAAGGACAGG
1051 ACAAAGCGGC TCGGGGCCAA GGATGACTTC ATGGAGATTA AGAGTCATGT
1101 CTTCTTCTCC TTAATTAACT GGGATGATCT CATTAATAAG AAGATTACTC
1151 CCCCTTTTAA CCCAAATGTG AGTGGGCCCA ACGACCTACG GCACTTTGAC
1201 CCCGAGTTTA CCGAAGAGCC TGTCCCCAAC TCCATTGGCA AGTCCCCTGA
1251 CAGCGTCCTC GTCACAGCCA GCGTCAAGGA AGCTGCCGAG GCTTTCCTAG
1301 GCTTTTCCTA TGCGCCTCCC ACGGACTCTT TCCTCTGA (SEQ ID NO: 1)

FEATURES:
Start Codon: 1
Stop Codon: 1336 
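
The start and stop codon coordinates above can be checked by translating the listed sequence in frame. The following is a minimal sketch, not the method used to produce the listing: the helper names `clean` and `find_orf` are hypothetical, the standard genetic code is assumed, and the demo uses only the first 30 bases of SEQ ID NO: 1 with an artificial TGA appended.

```python
# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(seq_text):
    """Drop position numbers and whitespace from a listing-style sequence."""
    return "".join(ch for ch in seq_text.upper() if ch in "ACGT")


def find_orf(dna, start=1):
    """Translate from a 1-based start position to the first in-frame stop.

    Returns (protein, stop_position), where stop_position is the 1-based
    coordinate of the first base of the stop codon, or None if no stop
    codon is reached.
    """
    protein = []
    for i in range(start - 1, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            return "".join(protein), i + 1
        protein.append(aa)
    return "".join(protein), None


# First 30 bases of SEQ ID NO: 1 plus an artificial stop codon (demo only).
fragment = clean("1 ATGGGGGAGA TGCAGGGCGC GCTGGCCAGA") + "TGA"
protein, stop = find_orf(fragment)
print(protein, stop)
```

Applied to the complete 1338-base sequence, the same function would be expected to report a stop position consistent with the "Stop Codon: 1336" feature, provided the transcription of the sequence is error-free.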



Homologous proteins: 

TOP 10 BLAST Hits

                                                               Score  E
CRA|67000082668077  gi|14756346 /def=ref|XP_037046.1| (XM...     843  0.0
CRA|18000005074572  gi|5032091  /def=ref|NP_005618.1| (NM_...    841  0.0
CRA|18000004907445  gi|477098   /def=pir||A48094 serum and ...   829  0.0
CRA|150000165029864 gi|13431833 /def=sp|Q9XT18|SGK_RABIT...      826  0.0
CRA|18000005246968  gi|6755490  /def=ref|NP_035491.1| (NM_...    824  0.0
CRA|18000004937507  gi|9507093  /def=ref|NP_062105.1| (NM_...    822  0.0
CRA|18000005171986  gi|3688803  /def=gb|AAC62398.1| (AF057...    776  0.0
CRA|18000005144813  gi|3116066  /def=emb|CAA11528.1| (AJ22...    711  0.0
CRA|18000005144812  gi|3116064  /def=emb|CAA11527.1| (AJ22...    709  0.0
CRA|335001098677651 gi|11321321 /def=gb|AAG34115.1|AF312...      579  e-164
BLAST hits to dbEST:

CRA Number            gi Number     Score  Expect
CRA|63000075619018    gi|14816226    1562  0.0
CRA|225000014874111   gi|18504859    1503
CRA|225000001750119   gi|15756574    1441
CRA|159000009754651   gi|13582978    1417
CRA|158000041295522   gi|10994817    1407
CRA|58000099052833    gi|12793499    1402  0.0
CRA|66000078090204    gi|15017913    1402  0.0
CRA|225000014943770   gi|18509828    1392
CRA|63000075528266    gi|14809983    1368  0.0
CRA|78000169320891    gi|14073071    1364  0.0
CRA|335001063053989   gi|10937149    1296
CRA|78000106804089    gi|10390589    1285  0.0
CRA|118000029469319   gi|10933084    1259
CRA|11000545765847    gi|9176035     1239
CRA|222000003126349   gi|16520713    1225
CRA|158000041316197   gi|10996884    1223
CRA|55000120105106    gi|13994615    1197  0.0
CRA|222000003339745   gi|16529516    1191
CRA|78000169332857    gi|14074159    1183  0.0
CRA|224000000151825   gi|15161415    1181
CRA|158000041310407   gi|10996305    1174
CRA|224000004588220   gi|15950481    1172
CRA|98000052723591    gi|13976289    1158  0.0
CRA|222000000718505   gi|15248907    1156
CRA|78000105668173    gi|10352924    1152  0.0
CRA|164000029925315   gi|11001544    1146
CRA|164000029914485   gi|11000461    1132
CRA|107000020244468   gi|9330126     1132
CRA|55000120090850    gi|13993063    1132  0.0
CRA|158000041316657   gi|10996930    1122
CRA|78000106872608    gi|10398650    1122  0.0
CRA|165000138685182   gi|14292101    1100
CRA|164000029922335   gi|11001246    1094
CRA|335001063029672   gi|10947466    1094
CRA|113000083010569   gi|12041410    1086
CRA|222000001551530   gi|15446801    1070
CRA|78000105702017    gi|10357421    1053  0.0
CRA|107000020410692   gi|9345242     1049
CRA|107000020408053   gi|9345002     1049
CRA|196000006451565   gi|12101876    1045
CRA|154000034710284   gi|10331724    1031
CRA|58000099305099    gi|12895306    1025  0.0
CRA|222000001550936   gi|15446747    1021  0.0
CRA|78000105667113    gi|10352757    1017  0.0
CRA|3000001081732     gi|2877701     1009  0.0
CRA|63000075437648    gi|14804614
CRA|114000006533071   gi|8142744
CRA|149000089175289   gi|13402435
CRA|155000041341862   gi|10144531
CRA|1000490895790     gi|5433043




EXPRESSION INFORMATION FOR MODULATORY USE: 

library source:

gi Number     Organ      Tissue Type
gi|13994615   brain      hypothalamus
gi|10999153   placenta   placenta
gi|10947466   mammary    mammary gland
gi|10996930   placenta   placenta
gi|15438670   brain      hippocampus

1 MGEMQGALAR ARLESLLRPR HKKRAEAQKR SESFLLSGLA FMKQRRMGLN
51 DFIQKIANNS YACKHPEVQS ILKISQPQEP ELMNANPSPP PSPSQQINLG 
101 PSSNPHAKPS DFHFLKVIGK GSFGKVLLAR HKAEEVFYAV KVLQKKAILK 
151 KKEEKHIMSE RNVLLKNVKH PFLVGLHFSF QTADKLYFVL DYINGGELFY
201 HLQRERCFLE PRARFYAAEI ASALGYLHSL NIVYRDLKPE NILLDSQGHI 
251 VLTDFGLCKE NIEHNSTTST FCGTPEYLAP EVLHKQPYDR TVDWWCLGAV 
301 LYEMLYGLPP FYSRNTAEMY DNILNKPLQL KPNITNSARH LLEGLLQKDR 
351 TKRLGAKDDF MEIKSHVFFS LINWDDLINK KITPPFNPNV SGPNDLRHFD 
401 PEFTEEPVPN SIGKSPDSVL VTASVKEAAE AFLGFSYAPP TDSFL (SEQ ID NO: 2) 



FEATURES: 

Functional domains and key regions: 
Prosite results: 

PDOC00001 PS00001 ASN_GLYCOSYLATION
N-glycosylation site
Number of matches: 4
      1     58-61     NNSY
      2     265-268   NSTT
      3     333-336   NITN
      4     389-392   NVSG

PDOC00004 PS00004 CAMP_PHOSPHO_SITE
cAMP- and cGMP-dependent protein kinase phosphorylation site
380-383 KKIT

PDOC00005 PS00005 PKC_PHOSPHO_SITE
Protein kinase C phosphorylation site
Number of matches: 4
      1     159-161   SER
      2     337-339   SAR
      3     351-353   TKR
      4     424-426   SVK

PDOC00006 PS00006 CK2_PHOSPHO_SITE
Casein kinase II phosphorylation site
424-427 SVKE

PDOC00007 PS00007 TYR_PHOSPHO_SITE
Tyrosine kinase phosphorylation site
130-138 RHKAEEVFY

PDOC00008 PS00008 MYRISTYL
N-myristoylation site
175-180 GLHFSF

PDOC00100 PS00107 PROTEIN_KINASE_ATP
Protein kinases ATP-binding region signature
118-150 IGKGSFGKVLLARHKAEEVFYAVKVLQKKAILK

PDOC00100 PS00108 PROTEIN_KINASE_ST
Serine/Threonine protein kinases active-site signature
232-244 IVYRDLKPENILL
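
The Prosite site positions listed above can be reproduced with a simple pattern scan. The sketch below is illustrative only: the regular expressions are assumed translations of the PS00001 and PS00005 patterns (N-{P}-[ST]-{P} and [ST]-x-[RK]), the function name `scan` is hypothetical, and only short fragments of SEQ ID NO: 2 are used, with an offset so that reported coordinates match the full-length numbering.

```python
import re

# Assumed regular-expression translations of two PROSITE patterns:
#   PS00001  N-{P}-[ST]-{P}   N-glycosylation site
#   PS00005  [ST]-x-[RK]      Protein kinase C phosphorylation site
MOTIFS = {
    "ASN_GLYCOSYLATION": r"N[^P][ST][^P]",
    "PKC_PHOSPHO_SITE": r"[ST].[RK]",
}


def scan(protein, pattern, offset=1):
    """Return (start, end, match) tuples, 1-based inclusive.

    A lookahead is used so that overlapping sites are all reported.
    `offset` is the sequence position of the first residue of `protein`.
    """
    hits = []
    for m in re.finditer(f"(?=({pattern}))", protein):
        start = m.start() + offset
        hits.append((start, start + len(m.group(1)) - 1, m.group(1)))
    return hits


# Residues 51-100 of SEQ ID NO: 2, so the offset is 51.
fragment = "DFIQKIANNSYACKHPEVQSILKISQPQEPELMNANPSPPPSPSQQINLG"
glyco_hits = scan(fragment, MOTIFS["ASN_GLYCOSYLATION"], offset=51)

# Residues 155-164 of SEQ ID NO: 2, so the offset is 155.
pkc_hits = scan("KHIMSERNVL", MOTIFS["PKC_PHOSPHO_SITE"], offset=155)

print(glyco_hits, pkc_hits)
```

Scanning by fragment keeps the example short; passing the full protein with the default offset of 1 would report every site in one call.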



Membrane spanning structure and domains:
Helix Begin End Score Certainty 
1 293 313 0.948 Putative 

BLAST Alignment to Top Hit:

>CRA|67000082668077 gi|14756346 /def=ref|XP_037046.1|
(XM_037046) serum/glucocorticoid regulated kinase [Homo
sapiens] /org=Homo sapiens /taxon=9606 /div=PRI
/dataset=nraa /length=431
Length = 431

Score = 843 bits (2154), Expect = 0.0
Identities = 406/407 (99%), Positives = 407/407 (99%)
Frame = +1

ouerv- 292 lafnikqrjm;wdfi^ 471 
query. .[afmkqrrmglndfiq ^ 

Sbjct: 25 lARVKQRRMGI^DFIQ^ 84 
Ouerv 472 LGPSSNPHAKPSDFHFLKVIGKG£FGKVLL^ 651 

query. lgpssnphakpsdfhflkvigkgsfgkvlu^kaee^ 

sbjct: 85 lgpssnphakpsdfhflkvigkgs fgkvllarhkaeevfyavkvlqkkai LKKKEEKHIM 144 

Query 652 SERNVLLKNVKHPFLVGLHFSFX^^ 831 

sWlK^PFLVGWFSF^^^ 
Sbjct: 145 SEF^LLKNVKHPFLVGLHFSFQTA^ 204 

Query- 832 eiasalc^hslnivyrdlkp^ 10U 
* eiasalc^lhslnivyrdl^ 

Sbjct- 205 EIASALGYUHSLNIVYTOLKPENILL^ 264 
Query: 1012 apevlhkqpydrtvtdwi/CLGAVLYEMLY^ 1191

APEVLHKQPYDRTVDWIC^^ 
Sbjct: 265 APEVLHKQPYDRTVDWUCLGAVL^^ 324 

Query: 1192 RHLLEGLLQKDRTKMXi^KDD 1371 

RHLLEGLLQKDRTKRLGAKDDFW^ 
Sbjct: 325 RHLLEGLLQKDRTKRLGAKDDFMEIKSHVFFSLIIVIV^ 384 

Query: 1372 FDPEFTEEPVPNSIGKSPDSVLVTASVKEAAEAFLGFSYAPPTDSFL 1512 

FDPEFTEEPVPNSIGKSPDSVLVTASVKEAAEAFLGFSYAPPTDSFL
Sbjct: 385 FDPEFTEEPVPNSIGKSPDSVLVTASVKEAAEAFLGFSYAPPTDSFL 431 (SEQ ID NO: 4) 



Hmmer search results (Pfam): 

Scores for sequence family classification (score includes all domains): 



Model     Description                                   Score    E-value   N
PF00069   Eukaryotic protein kinase domain              298.1    1.1e-85   1
PF00433   Protein kinase C terminal domain               56.0    5.6e-16   1
CE00022   CE00022 MAGUK_subfamily_d                      24.7    3.5e-07   1
CE00359   CE00359 bone_morphogenetic_protein_receptor    14.6    0.0019    2
PF00787   PX domain                                       7.2    2.4       1
CE00031   CE00031 VEGFR                                   0.6    2.7       1
CE00292   CE00292 PTK_membrane_span                     -49.8    3.8e-06   1
CE00287   CE00287 PTK_Eph_orphan_receptor               -53.1    7.6e-05   1
CE00289   CE00289 PTK_PDGF_receptor                     -67.6    0.37      1
CE00291   CE00291 PTK_fgf_receptor                      -87.2    0.00097   1
CE00286   CE00286 PTK_EGF_receptor                      -88.2    1e-05     1
CE00290   CE00290 PTK_Trk_family                       -153.3    9.1e-05   1
CE00016   CE00016 GSK_glycogen_synthase_kinase         -159.4    8.3e-08   1



Parsed for domains: 



Model     Domain  seq-f  seq-t     hmm-f  hmm-t       score   E-value
PF00787   1/1        40     75 ..    108    147 .]      7.2   2.4
CE00359   1/2       115    143 ..    144    175 ..      0.5   19
CE00289   1/1       110    212 ..      1    109 []    -67.6   0.37
CE00031   1/1       216    260 ..   1051   1095 ..      0.6   2.7
CE00359   2/2       232    283 ..    272    327        12.8   0.0064
CE00290   1/1       112    341 ..      1    282 []   -153.3   9.1e-05
CE00291   1/1       112    345 ..      1    285 []    -87.2   0.00097
CE00286   1/1       112    349 ..      1    263 []    -88.2   1e-05
CE00292   1/1       112    354 ..      1    288 []    -49.8   3.8e-06
CE00022   1/1       223    357 ..    133    275        24.7   3.5e-07
CE00287   1/1       112    367 ..      1    260 []    -53.1   7.6e-05
PF00069   1/1       112    369 ..      1    278 []    298.1   1.1e-85
CE00016   1/1        41    442 ..      1    433 []   -159.4   8.3e-08
PF00433   1/1       370    444 ..      1     70 []     56.0   5.6e-16
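
When reading the Hmmer tables above, it can help to separate hits that have both a positive bit score and a small E-value from the remaining models. The following is a minimal sketch: the row subset is copied from the classification table, while the function name `significant` and the thresholds (score above 0, E-value at most 1e-5) are illustrative assumptions, not cutoffs stated in the source.

```python
# (model, description, score, e_value) rows copied from the table above.
HITS = [
    ("PF00069", "Eukaryotic protein kinase domain", 298.1, 1.1e-85),
    ("PF00433", "Protein kinase C terminal domain", 56.0, 5.6e-16),
    ("CE00359", "bone_morphogenetic_protein_receptor", 14.6, 0.0019),
    ("PF00787", "PX domain", 7.2, 2.4),
    ("CE00292", "PTK_membrane_span", -49.8, 3.8e-06),
    ("CE00289", "PTK_PDGF_receptor", -67.6, 0.37),
]


def significant(rows, max_e=1e-5, min_score=0.0):
    """Keep rows whose score exceeds min_score and whose E-value is <= max_e.

    Both thresholds are assumptions chosen for illustration.
    """
    return [row for row in rows if row[2] > min_score and row[3] <= max_e]


kept = [model for model, _desc, _score, _e in significant(HITS)]
print(kept)
```

Requiring both conditions excludes models such as CE00292, which has a small E-value but a negative score in the table.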
1 TCTGGCTCGT GCTCTCATGT CATCTCAGAG TTCCAGCTTA TCAGAGGCAT
51 GTAGCAGGGA GGCTTATTCC AGCCATAACT GGGCTCTACC TCCAGCCTCC 
101 AGAAGTAATC CCCAACCTGC ATATCCTTGG GCAACCCGAA GAATGAAAGA 
151 AGAAGCTATA AAACCCCCTT TGAAAGGTTC GTACTTACCG TACTATATTT 
201 TGCAGATGCC TCAAAGGATT TGGGGTTACT TGGCATGGGG AAGGCACATA 
251 AGGTGGGGTG TAGGAGAGGG TCTCTGGTTG TAGGTTTCTT AATTTAATGT 
301 TTGAAAACAA ACATGCAAAA GTCTGTGTGC AGGTTGATGT TTCTGGGCAG 
351 CCTGAGCAAA ATTTGCTCTC TCAAGAGGGA AAGGAACCAG GTGGGAGCAG 
401 AGCTAGGCTG GGCTAGGCTA GTTGAATGGT GGGACATGAC ATACGGGTGG 
451 CACTGGCAAT AACAAAGTCA CATTCTATGA AGATTCCCTG CAAGAGGAAG 
501 CAGACATGGG CCAGTTACTG TGATTTGAAA TTGCCTAAAC ATTGCTTTAG 
551 GTTGGCATGT CAATTTCAGG TACTAGTGTT TTTTTTGTTT TTGTTTTTGT
601 TTTGTTTTTG TTTGTTTGTT TGTTTTGAGA CGGAGTCTCG CTCTGTTGCC
651 AGGCTGGAGT GCAGTGGCGT GATCTCGGCT CACTGCAACC TCCGCCTCCC 
701 GGGTTCAAGC GATTCTCCTG CCTCAGCCTC CCGAGTAACT GGGACTACAG 
751 GCGCACGCCA CCACGCCTGG CTAATTTTTC TATTTTCAGT AGAGACGGGG 
801 TTTCACCATG TTAGCCAGGA TGGTCTCGAT CTCTTGACCT CGTGATCCGC 
851 CCGCOTGGC CTCCCAAAGT GCTGGGATTA CAGGCGTGAG CCACTGCGCC 
901 CGGCCCCAGT AAATGCTTTT TATAAGTGTG GGCACTGAGC AAACTTTCCC 
951 AGCCAGACTC CAGGAGAGAG AATGTGTTTC CCTTCTCTCG GTTTGGGGCT 
1001 GTTGCAACAA AGCAAACCAA GGAGTTGAGA CTAGAGCTCA CTTTAGGGCA 
1051 AGTGGGGGTG GTTTTGCCTG CAAAACAAAC CCCTGCCCAA GACCAAGGAA 
1101 AAGGCGTTTC ACATGCTATT CCTGGTTTGA CAGCTGGTAT TTCGGGACTG 
1151 TGCCAGATCC AGTAGGCAAC TTTAAAATGG CAGAGCCTTT GGTAGCAAGA 
1201 GGTCATGGCA GGGCAGCCAC CGCAGACAGC AACAGCGAGC GCCAGGTACC 
1251 TGGCCCTGCG AATAGTGGTA ACTTGTAACT GCCCGCTCCG GGCCCAGTCG 
1301 CTGTGCTCGC GGCTTCCCGG CCAGCACTGG CTCACGTCCC CGCGCCGGCG 
1351 GTCAGGCTCC GGCTCCCAGA CATCCCCCAG CCGCGGGGTT ACTGGAAGGC 
1401 ACCGGCATCG CTGTTCTGCA GAGCCCGGGC CGCCGCCTCG AGCTTCCCTC 
1451 TCTTCCCTGC CTTCTGCAGC GGAGTCACCC GGCTAATCTT TCAGGATAAA 
1501 GTCACAGTTT ATGTGGGACT CACATAAAGA GCGAGCGAGG TGGCAAAACT 
1551 AAGAAGCCCT GGGGCAGCCT TGAGTTAAAC CCAGGGAGGG TAGGGACGAT 
1601 TTTAAGACCA TGTATCATGA CCTGCAGGGT TTTCAGGTGG GACAGCGGGA 
1651 GAGGAGCAGG CCCCACAGAG GAATCGAGGA TGCCCGGTTC ACGCCAGGTC 
1701 TGCCCCCGGG CAAAGCTACC CCTCCCTTCG CTTGTTACCT CCTCACGTGT 
1751 TCTTGGCATG GCAGAGATTA AAAATGCAAG GAAAAAAATT ACATGCGGAA 
1801 CGGACAAAAT GTTCTCAGAG ATTACTTCAG AAAAAAAAAA GTGAAATGCA 
1851 GATTGTACTT CTTCCTTTAG TGCAGAGACG ACTTTTATTT CCGCCCCCTC 
1901 CCCTCCACAT TCCTGACCTC TCCCTCCCCC TTTTCCCTCT TTCTTTCCTT 
1951 CCTTCCTCCT CTTCCAAGTT CTGGGATTTT TCAGCCTTGC TTGGTTTTGG 
2001 CCAAAAGCAC AAAAAAGGCG TTTTCGGAAG CGACCCGACC GTGCACAAGG 
2051 GCCATTTGTT TGTTTTGGGA CTCGGGGCAG GAAATCTTGC CCGGCCTGAG
2101 TCACGGCGGC TCCTTCAAGG AAACGTCAGT GCTCGCCGGT CGCTCTCGTC 
2151 TGCCGCGCGC CCCGCCGCCC GCTGCCCATG GGGGAGATGC AGGGCGCGCT 
2201 GGCCAGAGCC CGGCTCGAGT CCCTGCTGCG GCCCCGCCAC AAAAAGAGGG 
2251 CCGAGGCGCA GAAAAGGAGC GAGTCCTTCC TGCTGAGCGG ACTGGGTAAG 
2301 CGCCGCCGCC GGCCCCGCTG GGQGCTTGGC TCACTTCCCC AGAGCGGCTT
2351 GGAGGCAGGG GCCGGCTTTC GTCGGAGTTC TCGGGGCCGG GGTCCCGGCG 
2401 GCGGGAACGG GAGGACCTGG CGGGCGAGGT CGCGCGCGCA QGCCTGCGCC 
2451 CCAGGGATAA ACCCCGGAGG GTQGCGCGCA CCGCCGGCTC GGGTTGGGGA 
2501 GGAGGGTGGG AGTCCGGCCG CAGGACGGCG CCTGGCCGGG GAGAGGGTAT 
2551 CTGCAGGGAC AGTGAGCGAA GCCACCGTGG CCGCCGCGCA CCCGCCGGGA 
2601 AGCGCTTCGG CGCTGCGAAC CCGGCTTTCT CCGGCGGCGG AATAAATGAG 
2651 AGAGGTGGAA AACTACCCCG GGCTCTCCGG CCCTCCCCGC GCCCTCCGCC 
2701 GGCGCGTTCT CTCTCTCCTG CCCCAGGAGC CGATGGAGAC TGATAACGGC 
2751 CCTGCGCCAG GCCGTCCCCG GGCGGTCCTC GCGCCCCCGC CCGGGGCTCG 
2801 CCCTCTCAAT GGGGACAGAA CCGCCCGCCG CAGGCAGCGT AGCCGCCAGC 
2851 AAACCGCGAG GCGGTCGGGG CGGGGCGAGG GGCGAGGCGA AGGGCGGGGC 
2901 CACTTCTCAC TGTCGCGCAG GCCCCGCCCC CGCGGCGGTG CCTTTTTTAT
2951 AAGGCCGAGC GCGCGGCCTG GCGCAGCATA CGCCGAGCCG GTCTTTGAGC 
3001 GCTAACGTCT TTCTGTCTCC CCGCGGTGGT GATGACGGTG AAAACTGAGG 
3051 CTGCTAAGGG CACCCTCACT TACTCCAGGA TGAGGGGCAT GGTGGCAATT 
3101 CTCATCGGTG AGTGCAGGAA TCTTGCGGGA CTTCTGCTCC AGGAGACGCA 
3151 AAGTGGAAAT TTTTTGAAAG TCCCGGATCA GATTAGTGTG TGTGGCGCCG
3201 GACGTTATGA AGCCGTCTAA ACGTTTCTTT ATTTCTCCTC CTTCATCCAC 
3251 AGCTTTCATG AAGCAGAGGA GGATGGGTCT GAACGACTTT ATTCAGAAGA 
3301 TTGCCAATAA CTCCTATGCA TGCAAACAGT AAGTTTGACC GGATTTGAGG 
3351 AAATAACTAG TATAGTTTGA ATTTGCCAGC GGTAAACATT CTCATCACGG 
3401 CGTTTATCGG GAAGGCGAAG ACTTCTTCTG GGGTGGGGAT CTCATTTCTC 
3451 CTTAAATTCT AATATATTTG ACACATTTTA AACATTAAAG TTAATTTGCT 
3501 GATTTGGCTT GAACTGGAGA TGTAAGATAA ATGGTTCGTG TTGGCCGAAT 
3551 TCACGGCCTT TCTCCATGAG CAACAATCCT TATTTCTGTA TTTAATGGGG 
3601 TTTATTATTT TCTTTAACTG ACTAATGTAT TGGGGTATTT TCAGTTTAAA 
3651 CAGTGAATTA TCCGGGTAGA AGTCGGTAGA GCCAGGAAAC TCACTTTTGA 
3701 TGTTGGTGTG CCCCCTAGTG GCGAGCTGGA TTCTAAATCG TGCCCTTTAT 
3751 TCCCTGCAGC CCTGAAGTTC AGTCCATCTT GAAGATCTCC CAACCTCAGG 
3801 AGCCTGAGCT TATGAATGCC AACCCTTCTC CTCCAGTAAG TTTTTGTATG
3851 TGCCGTGCAT CTGTGGAGAA CTGTAAGGGA GTCAGTTAGT ATTCCTACAT 
3901 TAATGGATTA AAATAGCATT TCTAGAAATT AGTATCAAGG CAGGAATGCT 
3951 TCATTATGGC ATAACAAGTG ATATAAATAT TTAAGTATTG AGTCAGAGTA 
4001 TTATTTTATT TTTTTCCTGG GCATATTTTA CCTCCAAAGT GGTTATTTTA
4051 AAAGGCATAT TTCATAAAAA GGTTTTATCT GTCTGAAACA ACATGACTGT 
4101 GTGCAGTTTC CATACTCATT TGAAATGTGA TGAAATGTAG TTTTGAATGT 
4151 TTATAGATGT ATGGTCATTT GCATCAGTCA TTTGTAGATG TAACATTTTC 
4201 TACATCGTTT ATGTTATAGA TGTCTTCCTT TGAAGCAATG GTATTAAAAG 
4251 AAATTCTTTT TTTTTTTTTC TAGCCAAGTC CTTCTCAGCA AATCAACCTT
4301 GGCCCGTCGT CCAATCCTCA TGCTAAACCA TCTGACTTTC ACTTCTTGAA 
4351 AGTGATCGGA AAGGGCAGTT TTGGAAAGGT AATTTCAAAT CTGAAGATCT 
4401 TTTGGTACAC TTCCTTCATG TCCTCTTTTA TATTCTCCCT GGATGAGGAT 
4451 AGAAAAATGA TTTTTTTAAA TTGAAATTTC AGGTTCTTCT AGCAAGACAC
4501 AAGGCAGAAG AAGTGTTCTA TGCAGTCAAA GTTTTACAGA AGAAAGCAAT 
4551 CCTGAAAAAG AAAGAGGTAT GAGATGTGCT TGATGGGGCT GGCATTGGCG 
4601 GTAGACACTC CTTGAATAAT CTTGATTCTG GAATGTTGGT GCCAAGTTGA
4651 AACATGCCAC TAAATCTGAA TCGTCATTTT CCTAGGAGAA GCATATTATG 
4701 TCGGAGCGGA ATGTTCTGTT GAAGAATGTG AAGCACCCTT TCCTGGTGGG 
4751 CCTTCACTTC TCTTTCCAGA CTGCTGACAA ATTGTACTTT GTCCTAGACT 
4801 ACATTAATGG TGGAGAGGTG AGCAGGGGGG ATAGAAGTCA ACTCTTAGTG 
4851 TCTCTGCACA GCCTGCTTTG TTTTAGTTTG AGAAAAAAGT TTTCAAAGAT 
4901 TTTTGGTGGG GAGAATGTTA CCAGAATTAG CATTTCCTTC AACCTGTCAG
4951 GTTTATAGTT AATAGATTAC TTGGGGCCAC TTCCTGCAGT TGTTCTTTTG 
5001 CTGTGTATGT CAAAACTAAT TAAATTCATT TGCAACCCAG AATGACTTTG 
5051 TTCTGTCTCC TGCAGTTGTT CTACCATCTC CAGAGGGAAC GCTGCTTCCT 
5101 GGAACCACGG GCTCGTTTCT ATGCTGCTGA AATAGCCAGT GCCTTGGGCT 
5151 ACCTGCATTC ACTGAACATC GTTTATAGGT AAGCCTGAGA GCTCTTCAGG 
5201 CTACCAGTTT TGGTATAAAG GAGACGTAGC ACTGGCTGTT TCATAGGGCC 
5251 TTAAAATAAT TTGTGTTTAT TTGCAACTTG GTTGCCTAAA ACCAGATCCC 
5301 CTAGCACGTG AGCTGGCTTG ACTTAAGTGC CAAGGGGGAA CCAGCCAAGT 
5351 AGGATTGTGC CTAATCCAGA ATAGATGAGC AGAACAAGGG CTCCCTTTTT
5401 TCTTCACTAC ACAACTACAG TGAACCTAAA ATGCCTCTAA TACCTTTAGC 
5451 AATTATCTTT AAGAGGATAT CTTATGAAGT GAAATTAACT TGTGCAACTA 
5501 CTTTTCTATT CACTTTTTTA CAGAGACTTA AAACCAGAGA ATATTTTGCT
5551 AGATTCACAG GGACACATTG TCCTTACTGA CTTCGGACTC TGCAAGGAGA 
5601 ACATTGAACA CAACAGCACA ACATCCACCT TCTGTGGCAC GCCGGAGGTA 
5651 GGCGCTGTCT TGGTTTGGTG CCTGGTTTAC CCCCGCCTTC CAAGAGAGAG 
5701 ATGTACAATC ATGCACTTAA CTACCAAAAA GAGTAAACTC CTCTCAGAGA 
5751 CTTCTTAATA CAGTTCAGTG CAAATAAAAT ACATTTGCTG TTTGATGTAG 
5801 CATGAGAAAT CCCAAGTCCT TCTGTTCCTT TACTGAAAAG TAGCTGTTTG 
5851 TAAGTAAGAT CTGCATCATA AAAACTTTCT AAATCCCTAA GTAAGAGATA 
5901 TCAAGTGCCC AGCAGTTTCC TAAATGTCAG TACACATAGG TAGCCAGTCA 
5951 CCCTCAAAAA GTCCAGCAGT TTTATCAGGA AGGAATCTAA AGATATCTAT 
6001 CTTCCAAGCT GGCTCTGGGT CTCTCAGCTT TTTCAAACTA AATGTGTGGT 
6051 CGTGGGATTG CTTGCTTTCG CAGGTTCTAA ACGCTGTTTC CCTGGTCTGT 
6101 TTTTCAGTAT CTCGCACCTG AGGTGCTTCA TAAGCAGCCT TATGACAGGA 
6151 CTGTGGACTG GTGGTGCCTG GGAGCTGTCT TGTATGAGAT GCTGTATGGC 
6201 CTGGTGAGTG GCACATTGGG AACCATGGAA CACTGCCTGC TCCCTACAAT 
6251 ATTGCCTTCA CACAGCCCAT GCTTGGCCAT GGTGTCTTGC CCTTACCAGT 
6301 ACGCTTATCA AAAGCAGCTA AGAGGCATAT TGGTTATTTT ATAGTTCATA 
6351 AGAATAATCA CTTACCTGGT TCTTTTGTGC ATTTCACATT TTACTAGATA 
6401 GGACCACATT GAACCTGTGT GGTGGTGAAA AACTACCACT TATTAACATC 
6451 TACCCCCTCA CCCTCCACAC ACACACACAC AAACACACAC ACGGGTTGCA 
6501 AAGTAGACAC TTAAATAGCA AGGGAAAAGA AAGCATTGAG GTGGGGAGAG 
6551 TTTCTCAAAT CGAGCCTAAT ATTTATTGCC GTTTATATCT TTTTCTCTAC 
6601 TGGTAATGTG TGCCATATGA AACTTCCAAT TAAGTCTAAA GTAATTTTCC 
6651 CCTTCTTTCA GCCGCCTTTT TATAGCCGAA ACACAGCTGA AATGTACGAC 
6701 AACATTCTGA ACAAGCCTCT CCAGCTGAAA CCAAATATTA CAAATTCCGC 
6751 AAGACACCTC CTGGAGGGCC TCCTGCAGAA GGACAGGACA AAGCGGCTCG 
6801 GGGCCAAGGA TGACTTCGTG AGTGATGTTT TCCTGTCCTC CTGGGCCGGC 
6851 CGGGACGTGC ACTAGACCTC CCTGCCCTTA TTGAATGCAC CTGTCTAAAT 
6901 TAATCTTGGG TTTCTTATCA ACAGATGGAG ATTAAGAGTC ATGTCTTCTT
6951 CTCCTTAATT AACTGGGATG ATCTCATTAA TAAGAAGATT ACTCCCCCTT 
7001 TTAACCCAAA TGTGGTGAGT ATCTGTCTCT CTTCTAAGTA TAGAGAAGCC 
7051 CAAAGGGCAT TTATTTTAAT TCAGAATTGT CTGGGGGAGG GTTGGAAGGA 

7101 atacattggc agatgtntc tccataaacc tgttatttta cctacataaa 
7151 aagcacattt ttgtgtccca acaaggctcc cataattttt agacacattt 
7201 atcaattcga agcaccaaaa ggcaacaagt gaacattatt cttatgttta 
7251 actgtgtgta gccttttgag attttgtgct tgaagtgggt gattatggaa 
7301 gttgatataa gacttaaact tggtatttaa agcctggtca agatttccct 
7351 gtcctgtgtc tagtgtgagt tcttgacaag agtgtttttc ccttcccgtc
7401 acagagtggg cccaacgacc tacggcactt tgaccccgag tttaccgaag 
7451 agcctgtccc caactccatt ggcaagtccc ctgacagcgt cctcgtcaca 
7501 gccagcgtca aggaagctgc cgaggctttc ctaggctttt cctatgcgcc 
7551 tcccacggac tctttcctct gaaccctgtt agggcttggt tttaaaggat 
7601 tttatgtgtg tttccgaatg ttttagttag ccttttggtg gagccgccag 
7651 ctgacaggac atcttacaag agaatttgca catctctgga agcttagcaa 
7701 tcttattgca cactgttcgc tggaagcttt ttgaagagca cattctcctc
7751 agtgagctca tgaggttttc atttttattc ttccttccaa cgtggtgcta
7801 tctctgaaac gagcgttaga gtgccgcctt agacggaggc aggagtttcg
7851 ttagaaagcg gacgctgttc taaaaaaggt ctcctgcaga tctgtctggg
7901 ctgtgatgac gaatattatg aaatgtgcct tttctgaaga gattgtgtta
7951 GCTCCAAAGC TTTTCCTATC GCAGTGTTTC AGTTCTTTAT TTTCCCTTGT
8001 GGATATGCTG TGTGAACCGT CGTGTGAGTG TGGTATGCCT GATCACAGAT
8051 GGATTTTGTT ATAAGCATCA ATGTGACACT TGCAGGACAC TACAACGTGG
8101 GACATTGTTT GTTTCTTCCA TATTTGGAAG ATAAATTTAT GTGTAGACTT
8151 TTTTGTAAGA TACGGTTAAT AACTAAAATT TATTGAAATG GTCTTGCAAT
8201 GACTCGTATT CAGATGCTTA AAGAAAGCAT TGCTGCTACA AATATTTCTA
8251 TTTTTAGAAA GGGTTTTTAT GGACCAATGC CCCAGTTGTC AGTCAGAGCC
8301 GTTGGTGTTT TTCATTGTTT AAAATGTCAC CTGTAAAATG GGCATTATTT
8351 ATGTTTTTTT TTTTGCATTC CTGATAATTG TATGTATTGT ATAAAGAACG
8401 TCTGTACATT GGGTTATAAC ACTAGTATAT TTAAACTTAC AGGCTTATTT 
8451 GTAATGTAAA CCACCATTTT AATGTACTGT AATTAACATG GTTATAATAC 
8501 GTACAATCCT TCCCTCATCC CATCACACAA CTTTTTTTGT GTGTGATAAA
8551 CTGATTTTGG TTTGCAATAA AACCTTGAAA AATATTTACA TATATTGTGT 
8601 CATGTGTTAT TTTGTATATT TTGGTTAAGG GGGTAATCAT GGGTTAGTTT 
8651 AAAATTGAAA ACCATGAAAA TCCTGCTGTA ATTTCCTGCT TAGTGGTTTG 
8701 CTCCCAACAG CAGTGGTTTC TGACTCCAGG GGAGTATAGG ATGGTCTTAA 
8751 AGCCAACCTA CGTTCCAGGC CTTTTTAGCA GCATTTTATG GTGTCTGTCA
8801 TTCATAAATC CATCCAAGGA AATCCTTTGC AATTTACTCA TCTTGCAAGG 
8851 ATTGCTATGA AGTAATGCTT CCTGTATTTA TTGCCTGTCC TGTGAAGTTG 
8901 GACTATTTGT CCTGACATTT GGCTTGTCTT CAGTTACAGG TAATTCTTTC 
8951 CAGAAATATT TGAAAGCCTA CTCTGGGCTC TATTGCGAGT GCTCAGGATA 
9001 TCGTAGTGGA CAAAGCAGAC AACTTCGCCC TTCCAGAGCC TGATGAAGAA 
9051 GGCCGACCTA AAGCAGTTAG TTGAGATGGA AATTGAGAAA TAGTCTGTGA 
9101 AGTTTAGGAG AATGCCACAC AAGAGGGTGA GAATTTTTTT TTTTTTTTTT
9151 TTTTTTTTTG AGACACGGTC TTACTCTGTC GCCCAGGCTG GAGTGCAGTG
9201 GTGTGATCTT GGTTCACTGC AGCCTCCGCC TCCTGGGTTC ATGTGATCCC
9251 CCCATCTCAG CCTCCTGAGT AGCTGGGACT ACAGGCATGC ACCACCATGC 
9301 CTGGCTAATT TTTGTATTTT AGTAGAGATG GGATTTCACC ATGTGGGCCA 
9351 GGCTGGTCTC GAATCCCTGG CCTCAAGTGA TCTGTCTGCC TCGGGTCCC 
9401 TAAGTGCTQG GGAGAATGTT TTAAATAAGT GGATATGTTC CCAAAAAGCT 
9451 GACCTGGCTG GGACATCTGG TTTCTGAGAG TACCTGGAGT TGACCCAGGT 
9501 CTAGAGTGAG CTCAGTAAAG GGACCCTGAA GGAGCTCATC CCTAGCTTGG 
9551 ACTGAAGCTT CTTGAGCCAG TGTCTACCTA GCACCCTAAG GGCCCAGCAG 
9601 GCTCTGGGGC TGTGTGGCAG AGCCCACTCC TAGAGCTCAC CCCACTGTGA 
9651 TATTACCTGT GGGAGAAAGC GAGGTGGCAC CATCCTTGGA GATCTTGAGT 
9701 CCAAAGGTTT GGACTTTTTC ACTCTTCTAG GCOTCCACA CAAATACTTA
9751 ACAAATAATC AGGGAATCCC CAAACAGTTG ATGTTGCTGC TGCCTTAATT 
9801 GCAAAAGCAC CCTGTAGGCC TGCTGCACCC CCGCTACCCT GACCTTCCAG 
9851 TTCGCACAGG GATTTCCCCA AGGGAAAGCT GTGAGCTTTT TTCCTCTTAT 
9901 CCTTGCTCTT GGGTCTCACC TCACTTTGCC TCAGTCCCCC TCTCCTACCC 
9951 CACAAGGTTT CCAAGGGCCA AACAGGTGTT CAGAGATAAC CGAGTTCTTC 
10001 TCCCTCATGA TCTAATGAAG GAAGAAGATG AAAACGAGTC GATAGCTTTT 
10051 TGCTCAAGGT GGGCCACCGG TCATGCTCTG CTGTTGACTT ACTGCTCTAC 
10101 AGGCATTAGC TACGTGTTCA ATTCCCTACC GGGCCCAGTT GACAAATAAA 
10151 GAGTCCAAAG CAAGGCCAGG CACGGTGGCT CACGCTTGTA ATCCCAGCAC 
10201 TTTGGGAGGC CGAGGCGGGC AGATCACGAG GTCAGGAGAT CGAGACCATC 
10251 CTGGCTAACA TGGTGAAACC CCGTCTCTAC TAAAAATACA AAAAAATTAG 
10301 CCGGGCGTGG TGGTGGGCGC CTGTAGTGCC AGCTACTCGG GAGGCTGAGG 
10351 CAGGAGAATG GCGTGAACCA GGGAGGCGGA GCTTGCAGTG AGCCGAGATC 
10401 GCACCACTGC ACTCCAGCCT GGGCGACAGA GCAAGACTCT GTCTCAAAAA 
10451 ACAAAACAAA ACAAAAGCAT GTATTTTCCT ATTAAAGATT GATGCCGGCT 
10501 CTAACATAGA GACTCATTGC ATATTCCCCC TCATTCTCAT TCTCAATAAC 
10551 AGTTATGAAT TCCTCCTCGA ACA (SEQ ID NO: 3) 
Sim4 results:

Exon   Position
1      1-118
2      119-194
3      195-270
4      271-375
5      376-459
6      460-591
7      592-704
8      705-828
9      829-924
10     925-1080
11     1081-1170
12     1171-1338
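
A quick consistency check on the Sim4 ranges is to confirm that they are contiguous and that their lengths add up to the length of the transcript. The sketch below assumes the twelve ranges are transcript (cDNA) coordinates, which is suggested by their running from 1 to 1338; the function name `exon_lengths` is hypothetical.

```python
# Exon ranges as listed in the Sim4 results (1-based, inclusive).
EXONS = [
    (1, 118), (119, 194), (195, 270), (271, 375),
    (376, 459), (460, 591), (592, 704), (705, 828),
    (829, 924), (925, 1080), (1081, 1170), (1171, 1338),
]


def exon_lengths(ranges):
    """Return the length of each range, checking that ranges are contiguous."""
    lengths = []
    expected_start = ranges[0][0]
    for start, end in ranges:
        if start != expected_start:
            raise ValueError(f"gap or overlap before position {start}")
        lengths.append(end - start + 1)
        expected_start = end + 1
    return lengths


lengths = exon_lengths(EXONS)
print(len(lengths), sum(lengths))
```

The total of 1338 bases matches the end of the coding sequence in SEQ ID NO: 1, whose stop codon begins at position 1336.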



CHROMOSOME MAP POSITION: 
chromosome 6 



ALLELIC VARIANTS (SNPs):

DNA Position   Major   Minor   Domain
738            A       G       Intron

Context: 

DNA 

Position 

738 GACATACGGCTGG(^CTGGCMTAACAAA 

MGCAGACATGGGCCAGTTACTGTGATTTGAMTTGCCT AMCATTGCrrTAGGTTGGCA 
TGTCAATTTCAGGT ACT AGT(j I I I I I I I IGI I I I IU I 1 1 IU 1 1 IU I 1 1 IG I I IGI I I 
(jl I IGI 1 1 1 GAGACGGAGTCTCGCTCTGTTGCCAGGCTGGAGTGCAGTGGCGTGATCT 
GCTCACTGCMCCTCCGCCTCC^^ 
[A,G] 

CTGGGACT ACAGGCGCACGCCACCACGCCTGGCTAA 1 1 1 1 1 CTATTTTCAGTAGAGACGG 
GGTTTCACCATGTTAGCCAGGATGGTCrCGATCTCTTGA 
GCCTCCCAMGTGCTGGGATTACAGGCGTGAG^ 
TTTATMGTGTGGGCACTGAGCAMCTTTCCCAGCCAGACT 

TCCCTTCTCTCGGTTTGGGGCTGTTGCMCAMGC^ AGAGCT 
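
The context block above appears to follow the layout "5' flank, [major,minor], 3' flank" around the variant position. As a minimal sketch of how such a string can be built from a sequence and a 1-based position, the hypothetical function `snp_context` below is applied to bases 721-750 of SEQ ID NO: 3, so genomic position 738 corresponds to local position 18; the short flank length of 5 is chosen only to keep the demo readable.

```python
def snp_context(seq, pos, alleles, flank=300):
    """Format flanking sequence around a 1-based SNP position.

    Returns '<left flank>[a1,a2]<right flank>', with each flank truncated
    at the ends of `seq` if fewer than `flank` bases are available.
    """
    i = pos - 1
    left = seq[max(0, i - flank):i]
    right = seq[i + 1:i + 1 + flank]
    return f"{left}[{','.join(alleles)}]{right}"


# Bases 721-750 of SEQ ID NO: 3 (spaces removed).
region = "CCTCAGCCTCCCGAGTAACTGGGACTACAG"
local_pos = 738 - 721 + 1  # position 738 in the genomic numbering

context = snp_context(region, local_pos, "AG", flank=5)
print(region[local_pos - 1], context)
```

The reference base at the variant position in this fragment is A, in agreement with the major allele given in the table.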